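In information theory, the typical set is a set of sequences whose probability is close to <math>2^{-nH}</math>, where ''n'' is the sequence length and ''H'' is the entropy of the source distribution. That this set has total probability close to one is a consequence of the asymptotic equipartition property (AEP), which is a kind of law of large numbers. The notion of typicality is concerned only with the probability of a sequence, not with the actual sequence itself.

This is of great use in compression theory, as it provides a theoretical means for compressing data: any sequence ''X''<sup>''n''</sup> can be represented using about ''nH''(''X'') bits on average, which justifies the use of entropy as a measure of the information from a source.

The AEP can also be proven for a large class of stationary ergodic processes, allowing the typical set to be defined in more general cases.

==(Weakly) typical sequences (weak typicality, entropy typicality)==
If a sequence ''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub> is drawn from an i.i.d. distribution ''X'' defined over a finite alphabet <math>\mathcal{X}</math>, then the typical set, <math>A_\varepsilon^{(n)} \subseteq \mathcal{X}^{(n)}</math>, is defined as those sequences which satisfy:

: <math>2^{-n(H(X)+\varepsilon)} \leq p(x_1, x_2, \ldots, x_n) \leq 2^{-n(H(X)-\varepsilon)}</math>

where

: <math>H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x)</math>

is the information entropy of ''X''. Taking logarithms and dividing by −''n'', the definition can be stated equivalently as

: <math>H(X) - \varepsilon \leq -\frac{1}{n} \log_2 p(x_1, x_2, \ldots, x_n) \leq H(X) + \varepsilon</math>

For any ''ε'' > 0 and all sufficiently large ''n'', the typical set has the following properties:

# The probability that a sequence drawn from the source is typical is at least 1 − ''ε'': <math>\Pr\left[x^{(n)} \in A_\varepsilon^{(n)}\right] \geq 1 - \varepsilon</math>
# <math>\left|A_\varepsilon^{(n)}\right| \leq 2^{n(H(X)+\varepsilon)}</math>
# <math>\left|A_\varepsilon^{(n)}\right| \geq (1-\varepsilon)\, 2^{n(H(X)-\varepsilon)}</math>
# Most sequences are not typical. If the distribution over <math>\mathcal{X}</math> is not uniform, then the fraction of sequences that are typical is
:: <math>\frac{\left|A_\varepsilon^{(n)}\right|}{\left|\mathcal{X}^{(n)}\right|} \approx \frac{2^{nH(X)}}{2^{n\log_2|\mathcal{X}|}} = 2^{-n(\log_2|\mathcal{X}| - H(X))} \rightarrow 0</math>
::as ''n'' becomes very large, since <math>H(X) < \log_2|\mathcal{X}|</math>, where <math>|\mathcal{X}|</math> is the cardinality of <math>\mathcal{X}</math>.

For a general stochastic process with AEP, the (weakly) typical set can be defined similarly, with ''p''(''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>) replaced by ''p''(''x''<sub>0</sub><sup>''τ''</sup>) (i.e. the probability of the sample limited to the time interval [0, ''τ'']), ''n'' being the number of degrees of freedom of the process in that time interval and ''H''(''X'') being the entropy rate. If the process is continuous-valued, differential entropy is used instead.

Counter-intuitively, the most likely sequence is often not a member of the typical set. For example, suppose that ''X'' is an i.i.d. Bernoulli random variable with ''p''(0) = 0.1 and ''p''(1) = 0.9.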
In ''n'' independent trials, since ''p''(1) > ''p''(0), the most likely sequence of outcomes is the sequence of all 1s, (1, 1, ..., 1). Here the entropy of ''X'' is ''H''(''X'') = 0.469 bits, while

: <math>-\frac{1}{n}\log_2 p\left(x^{(n)} = (1, 1, \ldots, 1)\right) = -\frac{1}{n}\log_2\left(0.9^n\right) = 0.152</math>

So this sequence is not in the typical set, because its average logarithmic probability cannot come arbitrarily close to the entropy of the random variable ''X'' no matter how large we take the value of ''n''.

For Bernoulli random variables, the typical set consists of the sequences in which the proportions of 0s and 1s over ''n'' independent trials are close to their expected values. For this example, if ''n'' = 10 and ''ε'' is sufficiently small, the typical set consists of all sequences that contain exactly one 0. If ''p''(0) = ''p''(1) = 0.5, then every possible binary sequence belongs to the typical set.

Source: Wikipedia, the free encyclopedia, article "Typical set".
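
The Bernoulli example can be checked by brute force. The following Python sketch is an illustration added here, not part of the source article: the helper names `entropy`, `sample_entropy` and `typical_set`, and the tolerance ''ε'' = 0.1, are assumptions chosen for the demonstration. It enumerates all 2<sup>10</sup> binary sequences and keeps those whose per-symbol log-probability lies within ''ε'' of ''H''(''X'').

```python
from itertools import product
from math import log2, prod


def entropy(p):
    """Shannon entropy H(X) in bits for a distribution {symbol: probability}."""
    return -sum(q * log2(q) for q in p.values() if q > 0)


def sample_entropy(seq, p):
    """-(1/n) * log2 p(x_1, ..., x_n) for an i.i.d. sequence."""
    return -sum(log2(p[x]) for x in seq) / len(seq)


def typical_set(p, n, eps):
    """All length-n sequences with |sample_entropy - H(X)| <= eps (brute force)."""
    h = entropy(p)
    return [seq for seq in product(sorted(p), repeat=n)
            if abs(sample_entropy(seq, p) - h) <= eps]


p = {0: 0.1, 1: 0.9}
n, eps = 10, 0.1          # eps is an arbitrary illustrative tolerance
h = entropy(p)
all_ones = (1,) * n
typical = typical_set(p, n, eps)
prob_typical = sum(prod(p[x] for x in seq) for seq in typical)

print(round(h, 3))                                # 0.469
print(round(sample_entropy(all_ones, p), 3))      # 0.152
print(all_ones in typical)                        # False
print(len(typical), 2 ** n)                       # 10 1024
print(all(seq.count(0) == 1 for seq in typical))  # True
print(round(prob_typical, 3))                     # 0.387

# Uniform case: every sequence has sample entropy exactly 1 bit = H(X).
uniform = typical_set({0: 0.5, 1: 0.5}, 4, 0.01)
print(len(uniform))                               # 16
```

Under these assumptions the typical set for ''n'' = 10 contains only the ten sequences with a single 0, out of 1024 possible sequences, and excludes the most likely sequence (1, 1, ..., 1). Its total probability is only about 0.387, which illustrates that the guarantee of probability at least 1 − ''ε'' is asymptotic and need not hold for small ''n''.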